TF-IDuF: A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections
Abstract
TF-IDF is one of the most popular term-weighting schemes, and is applied by search engines, recommender systems, and user modeling engines. With regard to user modeling and recommender systems, we see two shortcomings of TF-IDF. First, calculating IDF requires access to the document corpus from which recommendations are made. Such access is not always available in a user-modeling or recommender system. Second, TF-IDF ignores information from a user’s personal document collection, which could – so we hypothesize – enhance the user modeling process. In this paper, we introduce TF-IDuF as a term-weighting scheme that does not require access to the general document corpus and that considers information from the users’ personal document collections. We evaluated the effectiveness of TF-IDuF compared to TF-IDF and TF-Only and found that TF-IDF and TF-IDuF perform similarly (click-through rates (CTR) of 5.09% vs. 5.14%), and both are around 25% more effective than TF-Only (CTR of 4.06%) for recommending research papers. Consequently, we conclude that TF-IDuF could be a promising term-weighting scheme, especially when access to the document corpus for recommendations is not possible, and thus classic IDF cannot be computed. It is also notable that TF-IDuF and TF-IDF are not mutually exclusive, so both metrics may be combined into a more effective term-weighting scheme.
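The abstract does not state the TF-IDuF formula, so the following is only a minimal sketch of the idea under one stated assumption: that the inverse frequency is computed like classic IDF, but over the user's personal document collection instead of the recommendation corpus. The function name `tf_iduf`, its parameters, and the toy documents are hypothetical and serve illustration only.

```python
import math
from collections import Counter


def tf_iduf(modeling_docs, personal_collection):
    """Sketch: weight terms by TF times an IDF-like factor that is
    computed over the user's personal collection (an assumption,
    not a formula taken from the abstract).

    Both arguments are lists of tokenized documents (lists of terms).
    """
    n_docs = len(personal_collection)

    # Number of documents in the personal collection containing each term.
    doc_freq = Counter()
    for doc in personal_collection:
        doc_freq.update(set(doc))

    # Term frequency over the documents chosen for user modeling.
    term_freq = Counter()
    for doc in modeling_docs:
        term_freq.update(doc)

    # Terms absent from the personal collection are skipped, because no
    # collection-based frequency is available for them.
    return {
        term: tf * math.log(n_docs / doc_freq[term])
        for term, tf in term_freq.items()
        if doc_freq[term] > 0
    }


# Hypothetical personal collection of three tokenized documents.
collection = [
    ["recommender", "systems", "paper"],
    ["recommender", "tf", "idf"],
    ["recommender", "user", "modeling"],
]
weights = tf_iduf([collection[1]], collection)
```

In this toy collection, "recommender" occurs in every document of the user and therefore receives a weight of zero, while "tf" and "idf" occur in only one document and receive a positive weight. No access to the recommendation corpus is needed at any point.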
Similar resources
A novel term weighting scheme based on discrimination power obtained from past retrieval results
Term weighting for document ranking and retrieval has been an important research topic in information retrieval for decades. We propose a novel term weighting method based on a hypothesis that a term’s role in accumulated retrieval sessions in the past affects its general importance regardless. It utilizes availability of past retrieval results consisting of the queries that contain a particula...
Mind-Map Based User Modeling and Research Paper Recommendations
Mind-maps have not received much attention in the user modeling and recommender system community, although they contain lots of information that could be valuable for user modeling and recommender systems. For this paper, we explored the effectiveness of standard user modeling approaches applied to mind-maps, and developed novel user modeling approaches that consider the unique characteristics ...
A Learning-Based Term-Weighting Approach for Information Retrieval
One of the core components in information retrieval (IR) is the document-term-weighting scheme. In this paper, we will propose a novel learning-based term-weighting approach to improve the retrieval performance of vector space model in homogeneous collections. We first introduce a simple learning system to weight the index terms of documents. Then, we deduce a formal computational approach acc...
Investigating the Similarity Space of Music Artists on the Micro-Blogosphere
Microblogging services such as Twitter have become an important means to share information. In this paper, we thoroughly analyze their potential for a key challenge in the field of MIR, namely the elaboration of perceptually meaningful similarity measures. To this end, comprehensive evaluation experiments were conducted using Twitter posts gathered during a period of several months. We investig...
Modeling Users for Adaptive Information Retrieval by Capturing User Intent
In this chapter, we study and present our results on the problem of employing a cognitive user model for Information Retrieval (IR) in which a user’s intent is captured and used for improving his/her effectiveness in an information seeking task. The user intent is captured by analyzing the commonality of the retrieved relevant documents. The effectiveness of our user model is evaluated with reg...